Prepositional Phrase Attachment through a Backed-off Model

Authors

  • Michael Collins
  • James Brooks
Abstract

Recent work has considered corpus-based or statistical approaches to the problem of prepositional phrase attachment ambiguity. Typically, ambiguous verb phrases of the form v np1 p np2 are resolved through a model which considers values of the four head words (v, n1, p and n2). This paper shows that the problem is analogous to n-gram language models in speech recognition, and that one of the most common methods for language modeling, the backed-off estimate, is applicable. Results on Wall Street Journal data of 84.5% accuracy are obtained using this method. A surprising result is the importance of low-count events: ignoring events which occur less than 5 times in training data reduces performance to 81.6%.

1 Introduction

Prepositional phrase attachment is a common cause of structural ambiguity in natural language. For example, take the following sentence:

Pierre Vinken, 61 years old, joined the board as a nonexecutive director.

The PP 'as a nonexecutive director' can either attach to the NP 'the board' or to the VP 'joined', giving two alternative structures (in this case the VP attachment is correct):

NP-attach: (joined ((the board) (as a nonexecutive director)))
VP-attach: ((joined (the board)) (as a nonexecutive director))

Work by Ratnaparkhi, Reynar and Roukos [RRR94] and Brill and Resnik [BR94] has considered corpus-based approaches to this problem, using a set of examples to train a model which is then used to make attachment decisions on test data. Both papers describe methods which look at the four head words involved in the attachment: the VP head, the first NP head, the preposition and the second NP head (in this case joined, board, as and director respectively). This paper proposes a new statistical method for PP-attachment disambiguation based on the four head words.
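The abstract names the backed-off estimate but does not spell out the procedure, so the following Python sketch is only an illustration of the general idea under stated assumptions. It estimates the probability of noun attachment from counts of the full (v, n1, p, n2) tuple and, when that tuple is unseen, backs off to triples, then pairs, then the preposition alone. The function names, the choice to keep the preposition in every back-off tuple, the `min_count` knob (a stand-in for the kind of low-count cutoff the abstract mentions) and the noun-attachment default are all assumptions of this sketch, not details taken from the text above.

```python
from collections import Counter

NOUN, VERB = 1, 0  # attachment labels (naming is an assumption of this sketch)


def backoff_levels(v, n1, p, n2):
    """Tuples consulted at each back-off level, most specific first.

    Every tuple keeps the preposition p. The leading string tags which
    slots are present, so that e.g. (v, p) and (n1, p) never collide.
    """
    return [
        [("v-n1-p-n2", v, n1, p, n2)],
        [("v-n1-p", v, n1, p), ("v-p-n2", v, p, n2), ("n1-p-n2", n1, p, n2)],
        [("v-p", v, p), ("n1-p", n1, p), ("p-n2", p, n2)],
        [("p", p)],
    ]


def train(examples):
    """Count every sub-tuple, overall and for noun attachment only.

    examples: iterable of (v, n1, p, n2, label) with label NOUN or VERB.
    """
    total, noun = Counter(), Counter()
    for v, n1, p, n2, label in examples:
        for level in backoff_levels(v, n1, p, n2):
            for key in level:
                total[key] += 1
                if label == NOUN:
                    noun[key] += 1
    return total, noun


def p_noun_attach(model, v, n1, p, n2, min_count=1, default=1.0):
    """Backed-off estimate of P(noun attachment | v, n1, p, n2).

    Uses the first level whose pooled count reaches min_count; if no
    level qualifies, returns the default (noun attachment, an assumption).
    """
    total, noun = model
    threshold = max(min_count, 1)  # guard against division by zero
    for level in backoff_levels(v, n1, p, n2):
        denom = sum(total[k] for k in level)
        if denom >= threshold:
            return sum(noun[k] for k in level) / denom
    return default


def attach(model, v, n1, p, n2, **kwargs):
    """Pick NOUN when the estimate is at least 0.5, otherwise VERB."""
    return NOUN if p_noun_attach(model, v, n1, p, n2, **kwargs) >= 0.5 else VERB
```

A model would be built with `model = train(examples)` and queried with `attach(model, "joined", "board", "as", "director")`. Raising `min_count` forces the estimate to skip sparse levels, which is one way to experiment with how much low-count events contribute.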

Similar resources

Attaching Multiple Prepositional Phrases: Generalized Backed-off Estimation

There has recently been considerable interest in the use of lexically-based statistical techniques to resolve prepositional phrase attachments. To our knowledge, however, these investigations have only considered the problem of attaching the first PP, i.e., in a [V NP PP] configuration. In this paper, we consider one technique which has been successfully applied to this problem, backed-off esti...

A Maximum Entropy Model for Prepositional Phrase Attachment

For this example, a human annotator's attachment decision, which for our purposes is the "correct" attachment, is to the noun phrase. We present in this paper methods for constructing statistical models for computing the probability of attachment decisions. These models could be then integrated into scoring the probability of an overall parse. We present our methods in the context of prepositio...

Statistical Models for Unsupervised Prepositional Phrase Attachment

We present several unsupervised statistical models for the prepositional phrase attachment task that approach the accuracy of the best supervised methods for this task. Our unsupervised approach uses a heuristic based on attachment proximity and trains from raw text that is annotated with only part-of-speech tags and morphological base forms, as opposed to attachment information. It is therefore...

Journal title:
  • CoRR

Volume abs/cmp-lg/9506021

Publication date 1995